Solution of an infection model near threshold 
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We study the Susceptible-Infected-Recovered model of epidemics in the vicinity of the threshold 
infectivity. We derive the distribution of total outbreak size in the limit of large population size 
N. This is accomplished by mapping the problem to the first passage time of a random walker 
subject to a drift that increases linearly with time. We recover the scaling results of Ben-Naim and 
Krapivsky that the effective maximal size of the outbreak scales as N 2 ^ 3 , with the average scaling 
as iV 1//3 , with an explicit form for the scaling function. 



Recently, Ben-Naim and Krapivsky[lJ (BN-K) studied 
the statistics of the size of an epidemic in the Susceptible- 
Infected-Recovered (SIR) model [2|, y, |j| when the infec- 
tivity is near its threshold value. When the infectivity is 
below threshold, an outbreak quickly dies out, infecting 
some finite number of individuals, essentially indepen- 
dent of the population size. Above the threshold, the to- 
tal average number of affected individuals reaches a finite 
fraction of the population. BN-K found that at thresh- 
old, the total average number of affected individuals is 
proportional to TV 1 / 3 , and that there are essentially no 
outbreaks which infect more than of order N 2 ^ victims. 
While presenting an argument justifying these scaling 
laws, no analytic calculations for the distribution of out- 
break sizes was given. In this paper, we present an exact 
formula for this distribution, in the limit of large popula- 
tion sizes, which of course exhibits the scaling properties 
found by BN-K. This calculation involves solving an aux- 
iliary problem, namely the first-passage time statistics [B| 
for a random walker released at x — 1 to be absorbed 
at the origin, given a small leftward drift which increases 
in magnitude linearly in time. This problem is one of 
the few such first-passage problems with time-dependent 
forcing 6] for which an analytic solution is available, and 
so is of independent interest. 

We begin with a description of the SIR model. The 
TV individuals in the population are divided into three 
subclasses: the susceptible pool, of size S; the infected 
(and infectious) class, of size I; and those recovered (and 
no longer infectible), of size R, with N = S + 1 + R. The 
disease is transmitted from an infected individual to a 
susceptible one with rate a/N, so that 



(S,I,R) a '^ N (S- 



1,1+1, R). 



(1) 



Infected individuals recover with a rate which we can take 
to be unity: 



{S,I,R) -4 (S,I- 1,R+ 1). 



(2) 



Of primary interest is the case where initially S = N — 1, 
I = 1, R = 0, so that the outbreak is sparked by a single 
infected individual. The outbreak of course terminates 
when the last infected individual recovers, and / returns 
to 0. 



This stochastic process is traditionally approximated 
(for large populations) by the classic SIR rate equations 



S = 

i = 

R = 



I 



(3) 



Since S decreases monotonically, these equations are eas- 
iest dealt with by eliminating the time and considering 
dI(S)/dS , which is obtained by dividing the second rate 
equation by the first: 



dl 
dS 



-1 



N 
aS 



with the solution 



N - 



N 



HS/(N - 1)) 



(4) 



(5) 



It is clear that if a < N/(N — 1) as 1, the rate equa- 
tion predicts that the number of infected individuals de- 
creases monotonically in time (decreasing S), whereas if 
a is greater than this threshold, the number of infected 
individuals first rises, and as S decreases, eventually N/S 
falls below a and / falls till it hits at 5/ satisfying 
j\T - Sf w (N/a)]n(N/Sf). Thus at the classical level, 
a = 1 marks the threshold between an infection that in- 
fects a finite percentage of the population and those that 
fail to spread. 

To study the stochastic process, we adopt a similar 
strategy and eliminate time, focussing solely on the tran- 
sitions between states. We characterize the system by 
the number of transitions the system has undergone. In 
each transition the number of infected individuals cither 
rises or falls by one, so that / undergoes a kind of ran- 
dom walk. After T transitions, S and R are completely 
specified by T and /, with for example 



AT-i(T + 7+l) 



(6) 



The probability of an upward transition is p+ = 
aS/(aS + N), whereas the probability of a downward 
transition is p_ = 1 — p+ — N/ (aS + N). These proba- 
bilities are unequal and depend on / and T, so that the 



2 



walk is biased, with a "time-" and space-dependent drift. 
(From here on, we will colloquially refer to T as time, and 
trust this will not lead to confusion). The form of these 
probabilities simplify at threshold, a = 1, where N — S 
and I are both much smaller than N. Then, 



P± = - T — (T + I) 
F± 2^ 8N y ' 



(7) 



where we have also assumed T is large enough that we 
can ignore the 1. Thus, the drift is, for large populations, 
very weak. 

This formulation immediately gives the well-known an- 
swer for an infinite population, where the bias term van- 
ishes and we have a simple random walk starting at 1 
with a trap at the origin. The distribution of first-passage 
times is 



P(T = 2k + 1) = 2 



-2k-l 



2k 



2k 
k + 1 



which for large T becomes 

P(T = 2fc + l) 



(8) 



(9) 



Our task is to identify how the bias, resulting from the 
reduction of the susceptible pool with time, modifies this 
answer. 

It is straightforward to generate the discrete-time mas- 
ter equation for our biased random walk. Since the bias is 
very weak, however, it is only effective at large times, and 
we are justified in passing to the Fokker-Planck equation 
for the distribution P(I): 



d 1 d 2 



J_d_ 

iNdl 



l(T + I)P] (10) 



One final simplification is to realize that the time- 
dependent drift is more effective than the spatially- 
dependent drift, and so the latter may be dropped. The 
argument is straightforward: The typically " length" scale 
I is proportional to T 1 ' 2 . Thus, the time-dependent 
drift is relevant when T _1 - T 1 / 2 /N, or T ~ TV 2 / 3 . 
The spatially-dependent drift become effective only when 
T _1 ~ l/N or T ~ N, much later than the time- 
dependent drift and so can be neglected. 
Thus the equation we need to solve is 



dP 
df 



ld 2 P 
2'dl 2 



T dP 
AN dl 



(11) 



This equation is difficult to treat in its current form, since 
it is not separable, but becomes so if we define 



p = e -/T/4W-T 3 /(96JV 2 )^ 



so that 



dip 
df 



1 d 2 ip 
2'dl 2 



AN 



if,. 



(12) 



(13) 



with the boundary conditions ip(0,T) = 0, ijj(I,0) = 
5(1 — 1). We can eliminate N from the equation by the 
scaling T = 2a 2 f, I = al, with a = (27V) 1 / 3 , resulting in 
(after dropping the tildes) 



dtp d 2 ip , 



(14) 



with -0(7,0) = 6(1 — l/a)/a. The operator on the right- 
hand side has an spectrum unbounded from above, so we 
need to regularize the problem by imposing an absorbing 
wall at some large L, which we will remove to infinity 
at the end. Clearly, introducing such a wall in the origi- 
nal equation for P has no significant effect, so it cannot 
materially affect our calculation in terms of ip- With this 
rcgularization, the right-hand operator has a well-defined 
discrete spectrum, with eigenvalues E n and normalized 
eigenfunctions (p n . In terms of this, the flux of if) to the 
trap at the origin is given by 



1=0 



2a 



1 dip 

2 af 



(15) 



The eigenfunctions <p n are given by 

<f> n (I) = A n Ai(-x + E n ) + B n Bi(-x + E n ) (16) 
The condition (p n (0) = implies that 

B n = -A n Ai(E n )/Bi(E n ) (17) 
and so, using the fact that the Wronskian of Ai and Bi 

is 1/-7T, 



</4(0) = -A n /(TTBi(E n )) 



(18) 



The normalization condition is 

r-L 



1 



K(I)dI 



i/2 







(</4(L)) 2 -(<^(0)f 



(A 2 n + B n )L^ 2 



(19) 



where we have used the fact that L is large to approxi- 
mate Ai and Bi by their asymptotic expansions for large 
negative arguments. 0] The density of states is easily cal- 
culated from these expansions to be 



dn 
dE 



£1/2 



We thus have, taking away the cutoff, 
1 f°° dE 



27r 2 a 2 7.^ Ai 2 (£) +Bi 2 (£) 



(20) 



(21) 



Translating back to the original units and P, this gives 
our major result for the probability density of extinction 
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of the epidemic at transition T, with S — N — T/2 sus- 
ceptible individuals left, so that n = T/2 individuals in 
all have been infected: 



P(n) 



-n 3 /(12JV 2 ) roc 



dE 



Ai 2 (£) +Bi 2 (£) 



,En/a 2 



(22) 



The first point to note is that this result satisfies the scal- 
ing behavior claimed by BN-K. To understand this result 
in more depth, we compute its asymptotic behavior, first 
for small n. In this limit, the integral is dominated by 
the integrand at large negative E, for which the integrand 
can be approximated by 



1 



Ai 2 (£) + Bi 2 (E) 
This gives the result 



3 BT/(2a 2 ) 



7r (- J B) 1 /2 e i5T/(2a 2 ) (23) 



P(n) 



1 



Ann 3 / 2 



(24) 



as it should, since the drift is not relevant for small n. 
For large n, the integral is dominated by positive E's of 
order n 2 . Then we the integrand is dominated by Bi, and 
the integral becomes 



P(n) 



1 



wE l/2 e -±E^ + En/a 2 dE 



80FiV ; 



■n 3/2 e~ n /{WN ' 



(25) 



Thus, P(n) is sharply cut off for n ^> 0(N 2 / 3 ) in ac- 
cord with the numerical results of BN-K. In Fig. [I] we 
graph P(n), together with the asymptotic formulas for 
small and large T. Also displayed is the exact solution 
of the full master equation for N — 10 3 . We see that 
indeed the finite systems converge nicely to the scaling 
limit, with slower convergence at very small n, where the 
discreteness of n is relevant, and at the largest n where 
our dropped spatially-dependent bias plays a detectable 
role. 

It is straightforward to extend our solution to the near 
threshold case, where a = 1 + S. This introduces an ad- 
ditional constant bias to the problem. Eq. (|13[) remains 
unchanged, where now ip is related to P by 



P 



I(T-2SN) (T -28N) -(2SN)° 



(26) 



so that in the probability distribution for outbreaks, one 
gets an additional factor exp((n 2 8N - n5 2 N 2 )/(3N 2 )) 
The appropriate scale for 5 is clearly N~ x / 3 as noted 
by BN-K. In this context it is interesting to consider the 
average outbreak size as a function of 6. The scaling with 
N of P(n) implies that the average outbreak n scales as 
TV 1 / 3 . While n(5) is given by a double integral, and must 
be computed numerically, the asymptotic behavior for 
large positive and negative J's is accessible. For large 
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FIG. 1: Scaled large-iV probability density NP(n) for out- 
breaks of total size n, versus the scaled outbreak size n/N 2 ^ 3 , 
from Eq. (pTj) . together with the small-n (Eq. (pi]) ) and 
large-n (Eq. (|25|l) asymptotics. Also displayed is the exact 
results for N = 1000. 



positive 5, the double integral has a sharp peak at n 



2SN, E = n7(4a 4 ), giving 



n « 2S 2 N 



(27) 



This is exactly what is needed to match on to the super- 
critical regime. For a - 1 > iV -1 / 3 , on average a finite 
fraction of the entire population is infected before the 
epidemic runs its course. The probability that the epi- 
demic survives to macroscopic proportions is 1 — l/a,[8( 
in which case the deterministic prediction of Sf is reli- 
able. Thus the average outbreak is of size 



1 



-rN, 



(28) 



where r satisfies r + e~ ar — 1, which for a near 1 yields 
Eq. I|27p . The exact results from the master equation for 
the supercritical regime are in excellent agreement with 
Eq. (|28p. except near the threshold region awl (data 
not shown). For large negative 5, on the other hand, 
small n's predominate, and 



dn 'i 



1 



20rn 3 / 2 



= 1/5 



(29) 



This in turn matches on to the subcritical result, n = 
1/(1 — a) as a approaches one from below. Thus, the near 
threshold hold regime interpolates smoothly between the 
sub- and super-threshold domains. In the former, the 
probability distribution is sharply peaked at 0, whereas 
in the latter there are two peaks, one at zero and a second 
at the deterministic value of n. It is in the near-threshold 
regime that this second peak is born and splits off from 
the first. In Fig. [21 we plot n(S) obtained from a numer- 
ical integration of our formula, together with the results 
for N = 10 3 , N = 10 4 , and N = 10 5 . We see that 
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(a-l)V 1/3) 

FIG. 2: n/N 1/3 as a function of (a- l)N 1/a for N = 10 3 , 10 4 , 
and 10 5 , together with a numerical calculation of our large- iV 
analytic formula. 
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FIG. 3: n/[2(l - a) 2 N] as a function of a, for N = 10 3 , 10 4 , 
10 5 and 10 6 . 

the agreement is excellent for small \S\ where the results 
have converged. Convergence is much slower, however, 
for large S. To see the convergence to the asymptotics 
better, we plot in Fig. \5\n/(25 2 N), and we see that the 
data approach the expected value of 1 as S approaches 0, 
only to veer off as the (TV-dependent) threshold regime is 
found. Similarly, in Fig[5J we show (n) together with the 
subcritical prediction, and see how the finite N data fol- 
low the diverging curve as a approaches 1, only to level 
off in the threshold region. 

In summary, we have found an analytic solution of the 
SIR model in the vicinity of the infection threshold, we 
interpolates between the solutions of the sub- and super- 



critical cases. This solution exhibits the scaling proper- 
ties found by BN-K. In addition, we have presented a 
solution to the problem of the first-passage of a random 
walker with a drift which increases linearly in time. 
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FIG. 4: n as a function of a, for N = 10 3 , 10 4 , and 10 s , 
together with the infinite-iV prediction for the subcritical 
regime. 
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